Reconstructing the Properties of Dark Energy using Standard Sirens 
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Future space-based gravity wave experiments such as the Big Bang Observatory (BBO), with 
their excellent projected, one sigma angular resolution, will measure the luminosity distance to a 
large number of gravity wave (GW) sources to high precision, and the redshift of the single galax- 
ies in the narrow solid angles towards the sources will provide the redshifts of the gravity wave 
sources. One sigma BBO beams contain the actual source only in 68 per cent cases; the beams that 
do not contain the source may contain a spurious single galaxy, leading to misidentification. To 
increase the probability of the source falling within the beam, larger beams have to be considered, 
decreasing the chances of finding single galaxies in the beams. Saini, Sethi and Sahni (2010) argued, 
largely analytically, that identifying even a small number of GW source galaxies furnishes a rough 
distance-redshift relation, which could be used to further resolve sources that have multiple objects 
in the angular beam. In this work we further develop this idea by introducing a self- calibrating iter- 
ative scheme which works in conjunction with Monte-Carlo simulations to determine the luminosity 
distance to GW sources with progressively greater accuracy. This iterative scheme allows one to 
determine the equation of state of dark energy to within an accuracy of a few percent for a gravity 
wave experiment possessing a beam width an order of magnitude larger than BBO (and therefore 
having a far poorer angular resolution). This is achieved with no prior information about the nature 
of dark energy from other data sets such as SN la, BAO, CMB etc. 



A remarkable property of our universe is that it is ac- 
celerating. The cause of cosmic acceleration is presently 
unknown and theorists have speculated that it might be 
due to the presence of the cosmological constant, an all 
pervasive scalar field called Quintessence, a Born-Infeld 
type scalar called the Chaplygin gas, etc. It has also been 
suggested that modifications to the gravity sector of the 
theory, such as extra dimensional 'braneworld' models or 
f{R) theories, might be responsible for cosmic accelera- 
tion. Establishing the nature and cause of cosmic accel- 
eration is clearly a paramount objective of modern cos- 
mology P, 01 . Standard candles in the form of type la 
supernovae (SNIa) and standard rulers such as baryon 
acoustic oscillations (BAO) observed in the clustering of 
galaxies, have played a key role in garnering support for 
the accelerating universe hypothesis. Standard candles 
rely on an accurate determination of the luminosity dis- 
tance to infer the expansion history and to make a case 
for cosmic acceleration. As pointed out in [3-^ a comple- 
mentary probe of the expansion history is available in the 
form of gravitational radiation emitted from compact bi- 
nary objects such as neutron star - neutron star (NS-NS) 
binaries, neutron star - black hole (NS-BH) binaries, or 
black hole - black hole (BH-BH) binaries. Indeed, it ap- 
pears that if the underlying physics behind gravitational 
radiation emitted by a NS-NS binary is well understood, 
then the luminosity distance to this object, Dl, can be 



established to a precision of about 2%, making this bi- 
nary an excellent standard siren. What is needed addi- 
tionally, to determine Dl{z), is the source redshift of the 
compact binary emitting gravitational radiation, and the 
main systematic uncertainty in this case is the possible 
misidentification of the galaxy hosting the binary object 
(see e.g. 0-01). 
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FIG. 1: This figure shows the posterior probability distri- 
bution P{z) produced by a linear model for Dl using the 
Chebyshev polynomials ((6}. The solid line shows the exact 
expression for P{z) given by (A9) of S3, while the dotted line 
shows the approximate expression for P{z), given by (A12) 
of S3, and which is used in this paper - see (|10|l . The two 
methods of determining P{z) agree very well. 

The proposed space-borne gravitational wave observa- 
tory LISA [il was expected to achieve an angular resolu- 
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tion of about 1', and the volume bounded by this angle 
is expected to contain roughly 30 objects at z ~ 1 [1]. Its 
replacement eLISA will have a considerably lower angu- 
lar resolution (several degrees) and will therefore contain 
far more objects within its field of view ,9]. Thus unless 
the galaxy hosting the binary system can be unambigu- 
ously identified by the electromagnetic afterglow of the 
merger event JJi], the large number of objects within the 
eLISA beam will compound the problem of host iden- 
tification. Clearly, then, a major source of uncertainty 
in determining D]^{z) using standard sirens is caused by 
the misidentification of the galaxy hosting the standard 
siren, and the chances for this to happen increase with the 
number of galaxies within the observational beam. Since 
gravity wave standard sirens have enormous potential for 
ascertaining the nature of dark energy, it would clearly 
be very desirable to minimize this source of systematics. 
Luckily one expects a substantial improvement in the di- 
rectional sensitivity of next generation gravitational wave 
(GW) experiments. Indeed, space observatories such as 
DECIGO_p, the Big Bang Observer (BBO) [iQ] and AS- 
TROD , currently in the planning stage, are likely to 
have a directional sensitivity of a few arc seconds or bet- 
ter, in which case one might expect only a single galaxy 
to fall within the field of view for a large fraction of ob- 
serving directions [T2j . 

Interestingly, even in the absence of an electromagnetic 
signature, the source galaxy of the GW signal can still 
be singled out from amongst the several galaxies lying 
within the observational beam if its redshift is consistent 
with the luminosity distance derived from an approxi- 
mately known cosmology. Utilizing this idea here- 
after S3) suggested an iterative scheme to identify the 
source galaxy of the (unresolved) GW signal. At the start 
of the iterative scheme, reliably identified GW sources — 
called 'gold plated' (GP) sources, following ^1^] — give a 
first estimate of the relationship between the luminosity 
distance and redshift (henceforth called the DZ relation ^ 
) . The expected number of GP sources depends crucially 
on the directional sensitivity of the experiment: good di- 
rectional sensitivity will result in a large number of GP 
sources whereas the opposite will be true for an experi- 
ment with poor sensitivity. The reason for this is simple, 
an experiment with good sensitivity will frequently have 
a single galaxy within its beam and optical follow ups 
could establish its redshift. 

However, even if one commences with fewer GP sources 
at the beginning, one can still improve the DZ relation 
iteratively as follows. For poor directional sensitivity 
(large angular uncertainty) most GW signals would be 
unresolved since several galaxies would fall within the 
(large) angular beam. However, even in this case, a 
(rough) DZ relation derived from GPs can single out one 



particular galaxy - the one which is most consistent with 
the DZ relation - to be the source. This increases the 
resolved set, thereby improving the DZ relation, and this 
procedure can now be used iteratively. As more and more 
sources are resolved, the estimate for the DZ relation im- 
proves and eventually saturates at the point when un- 
certainty in the redshift of the source is dominated by 
instrumental and lensing scatter rather than by our em- 
pirical knowledge of -Dl(z). 

S3 investigated the efficacy of this method analyti- 
cally, using the ensemble average of statistical quanti- 
ties at each step of the iteration. Analytically the it- 
eration scheme yields a recursion relation of the form 
iVj_|_i — f{Nj), where Nj is the number of resolved 
sources at the jth step of the iteration. The limiting 
number of resolved sources is then obtained by solving 
N — f{N), which is reached after an infinite number of 
essentially infinitesimal improvements. In practice, we 
expect the iterations to freeze much sooner due to Pois- 
son fluctuations since the number of resolved sources at 
each step of the iteration is in reality a rapidly decreasing 
random number, and is therefore not expected to change 
monotonically. 
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[1] Note that the DZ relation should coincide with Dl[z) for ideal- 
ized measurements. 



FIG. 2: This plot shows the number of gold plated sources 
(at the commencement of our iteration) as a function of total 
number of events. Note the excellent agreement between the 
theoretical prediction in Eq [16] (dashed line) and the simula- 
tion (filled hexagons). 

A crucial aspect of this method is the definition of the 
error box into which we expect the source to fall. The 
data only informs us that the source of the GW signal 
falls within a solid angle with a given probability] and the 
measured noisy luminosity distance to the source, along 
with the DZ relation, informs us that the source redshift 
falls within a redshift interval with a given probability. 
Therefore, given an error box we cannot be certain that 
the true source of the GW signal lies within it. S3 used 
one sigma BBO beams with one sigma redshift range (in- 
ferred from noise in the measured luminosity distance) to 
define the error box. However, the assumption of local- 
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izing the source galaxy within the one sigma error box 
is fraught with difBcuhies: if we assume Gaussian noise 
then the one sigma error box contains the true source 
only in 68 per cent of cases, the left-over 32 per cent tail 
will either contain no galaxy or a non-source galaxy. In 
the case when the error box is empty there is no prob- 
lem since it leads to an unresolved source, however, if the 
error box contains the wrong galaxy, with a redshift dif- 
ferent from the true source, the inclusion of this DZ pair 
into the GP set will bias the DZ relation. The misidenti- 
fled sources are a problem in general, but more seriously, 
their specific impact on the iterative process is to drive 
the inferred cosmology away from its true value because 
a biased DZ relation would deem non-sources as being 
closer to the biased DZ relation, causing the number of 
misidentifications to increase with iterations and thereby 
increasing the bias. On the other hand, choosing a larger 
error box to decrease the chances of misidentification in- 
creases the number of galaxies falling inside the error box, 
thereby decreasing the chance of resolving the GW sig- 
nal. Clearly, the size of the error box needs to be chosen 
in a manner that ensures that misidentifications remain 
small but not at the cost of resolving as many sources 
as possible. A useful recipe is to start with the smallest 
error box, implying the largest number of misidentified 
sources, and increasing its size until the bias becomes 
smaller than the random errors. 

On the average, the BBO/DECIGO one sigma angu- 
lar beam would not contain the true source in 32 per 
cent cases but may contain a non-source galaxy, lead- 
ing to misidentification. The only way to ensure that 
beams do contain the true source galaxy is to consider 
larger beams and regard all galaxies falling in, for exam- 
ple, nAr2BBo(^) beam (n ~ few), to be potential sources. 
This ensures that there are very few beams which do not 
contain the true source. However, as mentioned above, 
this will decrease the total resolved set and therefore 
should affect the inferred cosmological constraints. Sur- 
prisingly, the dark energy equation of state can be recon- 
structed remarkably well even for n as large as n — 10, 
as we demonstrate in this paper. 

Our main motivation for the present work is to investi- 
gate the above issues in detail through Monte-Carlo sim- 
ulation of data; and to explore choices for the error box 
that lead to a well determined DZ relation with minimal 
bias. 



I. SELF-CALIBRATION WITH MONTE-CARLO 
SIMULATED DATA 

A realistic Monte-Carlo simulation mimicking the out- 
come of the BBO/DECIGO experiment requires a careful 
consideration of the astrophysical aspects of the problem. 
First, we have to assume something about the type of 
galaxies that would host NS-NS binary systems, and their 
abundance, to derive the rate of NS-NS merger events as 
a function of redshift. The number of resolved sources 



as a function of redshift decides how well the DZ rela- 
tion is established, which in turn decides the precision 
with which one can reconstruct the properties of dark 
energy. A detailed prescription for sources of GW sig- 
nals is presently not well known and depends on theo- 
ries of stellar and galactic evolution, and in this work we 
continue to use the rates used in Eq 2 of [l^]. However, 
the main issue we are investigating here is the efficacy of 
identification of the source galaxies for GW signals using 
the DZ relation. Therefore this prescription is entirely 
adequate for our purposes. We believe that the achiev- 
able precision on cosmology quoted in our paper is likely 
to be fairly representative. 

For localizing the source galaxy, the prime uncertainty 
is due to the presence of other galaxies in the error 
box; therefore, it is also necessary to assume something 
about the spatial distribution of galaxies. If the galaxies 
are clustered in redshift then identification of the source 
galaxy becomes relatively more difficult. It is well known 
that galaxies cluster in a complex manner, with the clus- 
tering depending both on galaxy type as well as redshift. 
Using the two-point correlation function, S3 gave an esti- 
mate of the effect of clustering on the number of galaxies 
that are expected to fall within an error box. Basically, 
the net effect is to increase the number of galaxies in 
the vicinity of the actual source galaxy since the source 
galaxy is more likely to lie in a clustered environment. 
Although it is desirable to simulate data taking into ac- 
count the clustering of galaxies, this turns out to be a dif- 
ficult task. For the purposes of the present paper we shall 
neglect this effect and use the approximation that galax- 
ies are distributed uniformly randomly at each redshift 
(our method is described in greater detail below). As our 
previous analysis of the two-point correlation function in- 
dicates [15| , this will lead to slightly optimistic estimates 
of the final achievable cosmological constraints. 



A. Simulating Data 

The simulated data described below has for each GW 
signal an associated: redshift of the source galaxy; red- 
shifts of some (or none) non-source galaxies; the angular 
beam size at the source redshift; and the noisy distance 
estimate along with an estimate for the error in the dis- 
tance. We call this data, collectively associated with a 
single source, as a pencil. We earlier mentioned that if 
one considers only one sigma angular resolution then not 
all BBO beams will contain the source galaxy. In our 
simulation each pencil, by construction, contains a galaxy 
at the source redshift. Since a one sigma angular beam 
on an average contains the source in 68% of the cases, 
this amounts to assuming the total number of simulated 
sources to be about 50 per cent higher. Of the 32 per cent 
beams that do not contain the true source, a small per- 
centage could contain a non-source galaxy which would 
mistakenly be deemed to be a gold plated source. In our 
prescription these misidentified sources are not taken into 
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account. As we demonstrate later, even if we choose an 
angular beam of size lOAfiBBO) our method succeeds in 
resolving the redshifts of enough GW sources so that the 
implied constraints on dark energy are only marginally 
worse than what one obtains for a one sigma beam. Such 
large angular beams are almost certainly going to contain 
the source galaxy, therefore this prescription does not 
lead to any distortion in our conclusions for larger beam 
sizes. However, the results quoted for smaller angular 
beams would appear to contain smaller bias than would 
be the case had we included the misidentified sources. 

To populate non-source galaxies in the beams we divide 
the redshift range into a large number of redshift bins. 
Following jl^ . we consider sources up to z = 5. As men- 
tioned earlier, we ignore the clustering of galaxies, there- 
fore, we assume that galaxies are distributed uniformly 
randomly in each redshift bin Azbin- The mean number 
of galaxies, n, lying within the beam at a given redshift 
bin depends on the size of the bin Azbin and the angular 
size of the beam An{z), and is given by [l^, [3 [13 

n{z) ~ T^^r(z) exp [-r\z)] Af](z)Azbi„ , (1) 

where we have assumed a small Azbin, so the linear 
approximation in this equation suffices. Here r{z) = 
J^dz/h{z) is the c/Hq normalized coordinate distance, 
h{z) = H{z)/Hq and Nn = 1000 arcmin"^ is the pro- 
jected number density of galaxies consistent with the 
Hubble Ultra Deep Field [21|. Given fi, the probability 
that there be k galaxies in the bin is given by 

Pr(fc) =n'=exp(-n)/fc!. (2) 

Using a Poisson random generator we then populate each 
redshift bin with non-source galaxies. The one sigma 
BBO angular resolution, AI^bbOi ranges from 1 — 100 
arcsec^ and for this work we have adopted the redshift 
dependence of the BBO beam for NS-NS mergers from 
Fig 4 of n3\. For the BBO beam most of these bins do 
not contain any non-source galaxies since n <^ I. In prin- 
ciple, the redshift distribution of GW sources can differ 
from that of galaxies due to the redshift dependent rate 
of NS-NS mergers, which we have adopted from Eq 2 of 
[13]. Note that the NS-NS rate peaks at z = 1, which 
is close to where the galaxy density peaks, so these two 
distributions are similar to each other (also see figure H]). 

To complete the specification of data we also need 
a noisy estimate of the luminosity distance to the GW 
source. The dimensionless standard deviation (T,n{z)/DL 
is partly due to the fact that the luminosity distance to a 
GW source cannot be measured to better than about 2% 
relative accuracy (due to random instrumental noise and 
limitations of template fitting) and partly due to weak 
lensing. The dominant uncertainty is due to lensing. 
Lensing produces an asymmetric distribution of magni- 
fications: a majority of the GW sources are demagni- 
fied but a tiny fraction of them are highly magnified. 
If the redshifts of the highly magnified sources are re- 
solved independently then it is possible to handle them 



statistically; if not then such sources would either lead to 
misidentifications or would not be resolved at all. How- 
ever, for our purposes we set aside this complication^ and 
assume that the distribution is described by a symmet- 
ric Gaussian distribution with a dimensionless standard 
deviation Vwiiz) = 0.042z, that is derived from the re- 
sults of |13|- We add to the lensing standard deviation 
a fixed random instrumental/template noise with the di- 
mensionless standard deviation given by ryinst = 0.02, to 
obtain the standard error in the luminosity distance a^n 
given through 

^ = \J Vli + vLt ■ (3) 

Using a Gaussian random number generator with this 
standard deviation, we then assign a noisy distance mea- 
surement to each galaxy. 

II. USING THE DZ RELATION TO INCREASE 
THE NUMBER OF RESOLVED SOURCES 

If a pencil contains only a single galaxy {i.e. non- 
source galaxies are absent in the beam) then the host 
galaxy is obviously identified (note that, by construc- 
tion, all pencils contain the source galaxy), and so it 
can be used to portray the DZ relation at the redshift 
of the source (gold plated source). However, as men- 
tioned earlier, the luminosity distance to a GP source 
has a non-zero random error due to lensing scatter and 
measurement noise. So using a GP source one can iden- 
tify the DZ relation at the redshift of the source only to 
within a finite accuracy. 

If the number of GPs is comparable to the total num- 
ber of GW signals, then we need go no further, but as 
noted above, this will happen only if beams at all red- 
shifts are very narrow. The BBO one sigma angular reso- 
lution is sufficiently narrow for this to be the case. How- 
ever, since roughly a third of BBO beams will not contain 
the true source and might also contain a spurious one - 
thereby biasing the DZ relation - we shall consider beams 
of larger size'^ , to ensure that they more certainly contain 
the source galaxy. For larger beams the number of GPs 
could be much smaller than the total number of GW sig- 
nals, in fact, LISA/eLISA beams are so large that they 
would have no GPs. 

At this stage we have a number of GPs that can be 
used to further resolve the pencils that have more than 
the single source galaxy within them. To see how this can 
be done note that the non-source galaxies would typically 
have redshifts different from that of the source galaxy. 



[2] The asymmetry induced my lensing will be incorporated in a 
follow-up work. 

[3] It could also happen that the gravity wave observatory that is 
finally launched would have a beam width which is larger than 
that of BBO, in which case the above considerations would apply. 
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FIG. 3: Left panel displays the distribution of gold plated sources with redshift, with total number of GW sources being 300,000, 
30,000, 3,000, from left to right. Right panel is a plot of function (j>(zs) (Eq. 18). As discussed in the text, these two plots can 
be combined to obtain the fractional number of GPs for an n x BBO beam (nBBO) at a given redshift by multiplying the 
numerical value in the left panel and the numerical value in the right panel raised to power n — 1. 



For a given pencil, the (noisy) DZ relationship inferred 
from the GPs provides the probability distribution for the 
expected source redshift, P{z), for the signal. Clearly, 
the galaxy in a pencil that is closest to the peak of this 
distribution is more likely to be the actual source galaxy. 
The appendix in S3 gives a detailed Bayesian method for 
determining this probability distribution using a linear 
fitting function (linear in parameters of the model) for 
the luminosity distance. According to this method, the 
GP sources are used to infer the DZ relation by fitting 
them to a linear model for the DZ relation, namely 



N 



DL{z,h) = J2hifi{z) 



h^f, 



(4) 



where h are the N parameters of the model, ft are 
N arbitrary functions of redshift, and we have defined 
f = {/i(z), /2(^), . . . , /Ar(z)}. The number of terms in 
the fitting function is decided by the quality of data: bet- 
ter quality data requires larger N to adequately fit data. 
Moreover, the choice of functional form of fi should en- 
sure that the fitting function, for some values of param- 
eters h, is able to mimic the behavior of the target DZ 
relation. 

After constructing an appropriate fitting function, the 
model is fitted to the resolved sources and the errors on 
the parameters of the fitting function are obtained. They 
are described by the Gaussian distribution 



P(h) 



1 



(27r)^/2\/ditC 



exp 



4(h^-hJ)C-i(h-ho) 



(5) 

where C is the covariance matrix and ho are the best fit 
parameters. This fit is then used to infer the posterior 

probability distribution for the unknown source redshift 
given the noisy luminosity distance to the GW signal. 



The resulting expression is straightforward but compli- 
cated, and is given by Eq. (A6) of S3. 

The Chebyshev polynomials provide a flexible and con- 
venient linear fltting function for the luminosity distance, 



N 



DL{z,h) = ^/ijChebl(z), 



(6) 



7=1 



where Chebl(2;) is the 7th order Chebyshev polynomial. 
Depending on the quality of data, the order of fit N can 
be set at a value where the fit becomes good (in terms of 
the test). The reason for choosing Chebyshev poly- 
nomials over ordinary polynomials is that the numerical 
value of these polynomials remains bounded in the range 
(—1,1), within the range of the fit. This ensures that the 
statistical errors on the coefficients hi remain similar for 
all orders /. This is extremely important for translating 
them to errors on redshift through P{z), where the co- 
variance matrix needs to be numerically well behaved for 
inversion. 

Wc find that this linear model works well most of the 
time. But in a few high redshift cases the inferred red- 
shift peak fails to fall reasonably close to the true source 
redshift. This happens because noise in the data affects 
the fit adversely, especially at high redshifts where the 
number of resolved sources is small. The polynomial fit 
is in some sense local and does not respect the expec- 
tation of monotonicity, therefore it tries to over-fit any 
local feature produced by noise;; this produces artifacts 
that need to be handled individually, making it difficult 
to automate the process. 

Due to limitations of a linear Chebyshev polynomial 
based fit, we considered other alternatives. It is clear that 
a physics based model for the luminosity distance does 
not have these limitations. However, a crucial unknown is 
the physics governing the behavior of dark energy, which 
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is crucial for determining the luminosity distance. The 
unknown physics of dark energy is usually encapsulated 
in the form of a fluid model for dark energy, with an 
equation of state, p — wp. The unknown function w is 
then thought of as a function of redshift w{z), and can 
be parameterized in terms of a suitably versatile fitting 
function^ . For our work we follow the following recon- 
struction procedure which incorporates the CPL fitting 
hmction [l8| for w{z) given below: 



dx 



Dl{z) ^ c_ 
l + z Ho H{x) 



(7) 



where, 



w{z) 



(l-flAf)exp(3 ri±^^dz 
I Jo ^ + z 



Pde/pde ^ wo + 



WlZ 

l + z 



(8) 



with ACDM corresponding to wq — —1, wi = 0. The 
CPL ansatz produces reasonable fits if dark energy has a 
slowly varying equation of state. 

However, from the perspective of the iterative scheme 
which we develop in this paper, this fitting function has 
a problem: it depends non-linearly on the model param- 
eters 51 Af, Wo and wi. The posterior probability P{z) for 
the source redshift can be analytically obtained only if 
the luminosity distance depends linearly on its parame- 
ters (see appendix of S3). A possible remedy is to: (i) use 
the fitting function d?]), © to the simulated data to ob- 
tain the best fit parameters (JImoj Wio)', (ii) then linearize 
([7]) in Qm and Wi through a Taylor expansion about the 
best fit parameters: 



,{z,e,) = DL{z;0^o)+J2i^^-^ 



diMz^ 



where, for brevity, {0i} = {Um ,wo,wi}; the derivatives 
are evaluated at the best fit parameters 0io. This function 
is linear in parameters 9i and can be used to analytically 
marginalize over the model parameters to obtain P{z). 
If the original fit is tight, in the sense that the errors 
on parameters 9i are small, then the linearized fitting 
function i^LincarC-z, Oi) scrvcs as a close approximation to 
the original. 

It turns out that a further simplification is possible 
that makes our task much simpler. The posterior prob- 
ability distribution given in Eq. A9 of S3 is not a simple 
Gaussian distribution. The distribution has a peak at 
the redshift that the best fit model predicts for the mea- 
sured noisy distance to the GW signal but its redshift 



[4] In other approaches it is also possible to work with a fitting 
function for the Hubble parameter H{z) as discussed in Q|. 



dependence can be complicated. It is however possible 
to systematically extract its leading local behavior to ob- 
tain the Gaussian distribution 



P{z) 



271(7. 



■ exp 



{z - zof 



2gI 



(10) 



where = v ("'m + '''c)/-C'l; the standard error am 
is from Eq. |3l and CTc is the cosmology error defined 
in the last section of the Appendix in S3; Z?^ = 
dDL[z, Oio)/dz\z=za] the parameter zq is the best fit red- 
shift inferred for the GW signal and is computed from 
the prescription given there, but basically it is the red- 
shift inferred for the noisy distance estimate via the best 
fit model Dl{z; Bio). Note that due to noise in the mea- 
sured luminosity distance, zq need not coincide with the 
true source redshift. 

In Fig [1] we plot a comparison of P{z) produced by 
the local Gaussian approximation (|10p and the more ex- 
act expression A9 of S3 (both distributions are produced 
by using a Chebyshev polynomial based fit). We find 
that the Gaussian approximation agrees very well at all 
redshifts. In fact, the agreement becomes better with 
an increase in the number of resolved sources because 
the error bars on parameters become smaller, and A9 of 
S3 starts approaching a Gaussian distribution. There- 
fore, the local Gaussian approximation is adequate for 
our purposes. 



A. Error Box 

The local approximation for P{z) in (jlOp is convenient 
in that it enables us to define the error box for the source 
location in terms of the variance az of the local approxi- 
mation Eqlini Note that this task would be considerably 
more complicated had we used A9 of S3, which is not 
described by a few simple parameters as the Gaussian 
distribution is. To define our error box, let us first choose 
the beam size in which we expect the source to lie as 



Af](z) ^nAf^BBO, 



(11) 



where AJIbbo is the redshift dependent one sigma resolu- 
tion of the BBO experiment. The beam has been chosen 
as a multiple of the BBO resolution to ensure a high 
probability for the source to fall inside the beam. (This 
also accommodates the possibility that, as in the case of 
LISA — eLISA, the space mission which finally flies has 
a larger beam width than BBO.) Similarly, we can define 
the redshift range in which the source is likely to lie as 



Az = 2maz 



(12) 



which is centered at the peak of the probability distri- 
bution P{z). For simplicity, the redshift range has been 
chosen to be a multiple of the one sigma range obtained 
from P{z). 
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FIG. 4: Left panel shows the normahzed distribution of resolved GW sources at the end of the iteration. The solid hue is the 
redshift distribution of all sources assuming BBO resolution; dashed lines show the redshift distribution of resolved sources 
obtained after assuming that the source lies within la of the peak in the probability distribution P{z) defined in (|lUp . In 
other words the dashed histogram corresponds to n = 1 in and m = 1 in (|12|l . Dark gray shows the same but for a 2a 
allowed region for sources, corresponding to m = 2 in (|12|) . Finally the light gray shade reflects a 2>a allowed region for sources 
and corresponds to m = 3 in (I12p . Note that the number of resolved GW sources is comparable to the total number of GW 
sources for ra = 1 and is only slightly worse for m = 2, 3. Right panel discusses the issue of source misidentification. Clearly by 
restricting GW sources to lie within an ma region of the peak of the probability distribution P{z) we open up the possibility 
of source misidentification which is larger for smaller m. This is clearly revealed in the inset to the right panel (the shading 
scheme is the same as in the left panel). We see from this panel that fewer sources are misidentified for 3cr (light gray) than 
for la (dashed) . The main panel on the right shows the distribution of resolved sources after misidentified sources have been 
identified. Thereafter misidentified sources do not play any role in the iterative scheme which, in the case of the dashed region, 
coincides with the solid line corresponding to the redshift distribution of all GW sources. 



Iterative Scheme 



after N steps): 



We generate pencils with different values of n in (ITlT) . 
As mentioned above, larger values of n increase the 
chances of the true source of the gravity wave signal lying 
within the beam, but have the obvious side effect that the 
number of non-sources in the beam also goes up. Clearly, 
with a larger number of non-sources in each beam, the 
number of gold plated sources (CPs) in the set goes down, 
therefore, n cannot be chosen too large, otherwise there 
will be no GPs with which to start our iterative process. 
The GPs are used to derive an initial estimate of the lu- 
minosity distance which is used to infer P{z). Knowing 
P{z) we repeat the process of going through each pencil 
and checking whether any (single) galaxy is present in the 
redshift range zq — maz < z < zq + maz , where zq corre- 
sponds to the peak of P{z). Galaxies inside pencils where 
this test succeeds are referred to as resolved galaxies and 
added to the burgeoning inventory of sources. This pro- 
cess is repeated until there is no further increase in the 
number of resolved sources and the iterative method sat- 
urates. Now the final value of the luminosity distance is 
used to determine w{z) - the equation of state of dark 
energy, using ([7]) and ([5]). Our self-calibrating iterative 
scheme is summarized below (convergence being reached 



P(i)(z) 
p(^-i)(^) 



j\qp 



d['\z) 
d'^\z) 



P^^^z) 
P(^\z) 

w(z) . 
(13) 



Note that as more resolved sources are added, the shape 
of P{z) becomes peakier and narrower (smaller Uz), 
which helps in picking new resolved sources from amongst 
contender galaxies. This practice, outlined in ([T3)) . is con- 
tinued until convergence is reached and no new source 
galaxies can be added to the resolved sample. 

Let us illustrate this with an example. Supposing of 
the 1000 pencil beams that have been generated only 30 
have a single galaxy falling within them and the remain- 
ing beams contain two or more galaxies. Then we can 
safely assume that these 30 galaxies host GW sources 
and, using optical/IR observations, determine their red- 
shifts. This establishes for us our initial DZ relationship. 



namely d'"j^\z) in ([T^. where A^p = 30. Next, knowing 



(1) 



D^j^' (z) we determine the probability distribution func- 
tion, P(i)(z) = P'^^^z\D'-p), through Clearly our 
knowledge of P{z) will inform us which of the two or more 
candidate galaxies (in each of the additional 970 beams) 
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is a potential gravity wave source, since this galaxy would 
lie closer to the peak of P{z) than its companions. In- 
cluding the resolved galaxy, and others like it, will es- 
tablish an enlarged sample, and this procedure will be 
followed iteratively (for N steps) until no new resolved 
galaxies can be added, at which point we say that con- 
vergence has been reached and the final value of w{z) is 

determined from D^^'' (z) using ([7]) & ([S]). If the error box 
is small (small n and m) then it follows that the chances 
of resolving a GW signal increases since it is more likely 
that we will find only a single galaxy within the error box. 
However, a very small error box would also imply that the 
probability of the true source galaxy lying within it be 
small, leading to a larger number of misidentifications in 
this case. Misidentifications have the pernicious effect of 
biasing the DZ relation away from its true value, resulting 
in a positive feedback on the chances of misidentifications 
during later iterations and converting the biasing of the 
DZ relation into a runaway process. In the other extreme 
case when the error box is very large, the possibility of 
misidentification goes down but the chances of resolving 
GW signals goes down as well, resulting in a decrease in 
the quality of constraints obtained for cosmology. Clearly 
then, an optimum value for the error box should be such 
that the constraints on cosmology are tightest while at 
the same time the bias is below the statistical scatter. 



probability 



Pr(zs; Gold Plated) — exp 



-y^n(z,;zs) 



(14) 



In the expression for n{zi; z^) (Eq [T]), we see that it 
is proportional to the bin size Azbin. The probability 
Pr(zs; Gold Plated), however, is understood to be ob- 
tained in the limit of infinitesimal bin size, in which case 
the sum in the above expression will be replaced with an 
integration. 

For a small enough bin size, defining fs as the fraction 
of GW sources at redshift Zg, the fraction of gold plated 
GW sources at Zs is given by 



/gp(zs) = /s(2s)exp 



-y^n(z,;zs) 



(15) 



and the total number of GPs by 



Nqp = iVTotai ^ fs{zs) exp 



(16) 



III. RESULTS 

At commencement, a small number of gold plated (GP) 
sources is required in order to obtain the zeroth order 
cosmology to start our iterative process of statistical res- 
olution of GW sources. The probabilistic expectation 
value of the number of GP sources for a given number of 
total sources and the beam size can be estimated as fol- 
lows: Dividing the redshift range into a large number of 
bins, the mean number of galaxies n(zi) in a redshift bin 
can be estimated through Eq [1] using the angular beam 
AJ7(zs), where Zg is the redshift of the source galaxy; and 
Azbin, where Zi is the redshift of the bin. 

It is worth noting that the probability for a source to 
be a GP depends on the angular resolution with which 
the source can be resolved, therefore the angular resolu- 
tion of the beam is evaluated at Zg, thereby making n{zi) 
a function of Zg, which we notate explicitly below. As 
mentioned earlier, the BBO angular resolution which we 
have used is taken from Fig 4 of [l2|, corresponding to 
NS-NS mergers. Our prescription puts the actual source 
galaxy in each pencil, therefore, the probability that the 
pencil contains no other galaxy can be obtained from the 
probability that the redshift bin at Zi is empty, which 
is Pr(0) = exp[~n{zi; Zg)] (see Eq[5]). Clearly, a pen- 
cil will give a GP source if all bins are empty^ giving a 



Fig [2] shows the number of gold plated gravity wave 
sources as a function of total number of GW sources ob- 
tained for the BBO angular beam. As predicted by (fT6l) , 
the theoretical expectation that Nqp cx iVrotai is found 
to be true in our simulations. We find that even for a 
small total number of sources (~ 1000) the GP set is 
large enough (^ 100). The left panel of Fig [3] plots the 
function /qp'-'(zs). We find that the distribution is in- 
dependent of the total number of sources, in accordance 
with theoretical expectation. It is important to note that 
the distribution of sources in redshift is wide enough for 
obtaining a good starting DZ = Dl{z) relation. Note also 
that the peak of the redshift distribution /gp of GPs (at 
z ^ 0.3) does not coincide with the peak of the galaxy 
distribution [z ^ 1). This is due to the fact that there 
is an additional redshift dependence due to the redshift 
dependent beam size [l^. The BBO angular resolution is 
better at smaller redshift; however the number of sources 
first increase, peak at z ~ 1, and then decrease with the 
redshift; the net effect therefore is to shift the peak of 
the GPs redshift distribution to a lower redshift. 

For larger (than BBO) angular resolution, the number 
of GPs decreases since the beam, in many cases, becomes 
too large to accommodate only a single galaxy. This 
effect can be estimated in terms of the function /qp^*^(2) 
for Af2(z) — nAriBBo(-2), which can be easily shown to 



[5] Recall that according to our prescription an empty pencil beam 



is redefined so as to contain the source galaxy. 
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be given by 



/•nBBO 

Jgp 



(zs) = /s(zs)exp 



fi^°{z^)r-'{z,) 



exp 



«BBo(2i 



(17) 
(18) 



We find that at each redshift, the number of GPs pre- 
dicted for the nAr2BBo(-2) beam is smaller than that for 
^^BBo{z) beam by a redshift dependent multiplicative 
factor 0"~^(zs). The right panel of Fig |3] plots the func- 
tion (j){z). We see that for z > 0.1, (/) < 1 is a smaU 
fraction; therefore this factor decreases very rapidly with 
n. The multiplicative factor increases with n for z < 0.1 
but since the fraction of sources is small at small red- 
shift, this does not cause an abnormal increase in GPs at 
low redshift with increasing n. Although mathematically 
speaking this factor blows for large values of n, at a large 
enough n the Poisson statistic will cease to hold, making 
this argument invalid. 

We simulated 300, 000 pencils corresponding to three 
years of cumulative BBO data. The resolved samples 
were fitted, at various stages of iteration, with the non- 
linear model for Dl{z) described by ([7]) in which the 
CPL equation of state ([5]) has been used. The model was 
linearized over the polynomial coefficients and the local 
approximation discussed earlier was then used to derive 
P{z). The fiducial model chosen for all our simulations 
was a flat ACDM model with il„i = 0.3. In all cases 
(described below) the iterations freeze out after about 
ten steps implying that convergence had been reached, 
i.e. = 10 in p^ . 

In the left panel of Fig 2] we plot the total number of 
resolved GW sources obtained at the end of our itera- 
tive run with n = 1 in (ITU) and m — 1,2,3 in ([T^ . It 
is interesting that the number of resolved GW sources 
is comparable to the total number of GW sources for 
m — 1 and is only slightly worse for m = 2,3. The inset 
in the right panel shows the distribution of misidenti- 
fied sources (the redshift has been identified incorrectly). 
Recall that n = 1 corresponds to the Icr angular reso- 
lution for BBO and, as pointed our earlier, this implies 
that roughly a third of all beams will not contain any 
GW sources at all ! However, in our simulations we en- 
sure that all pencils do contain the true source, therefore 
figure m does not include the effect of misidentification 
of sources due to small beam size. The other impor- 
tant cause for misidentifications is due to a choice of too 
small an allowed redshift range, which is what the inset 
in figure|3]illustrates (the number of misidcntificd sources 
increases as m decreases). The fraction of misidentified 
sources peaks at z ~ 1 because, as mentioned earlier, the 
BBO beam monotonically becomes wider at larger red- 
shift and so has a larger number of non-source galaxies 
falling into it; therefore the expected peak redshift for 
misidentifications is expected to be larger than the peak 
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FIG. 5: Constraints on wq and wi with misidentified sources 
included in the resolved sample. The beam size is fixed at 
the one sigma angular resolution of BBO, corresponding to 
n = 1 in while the redshift range in which the source is 
expected to lie is given by (I12p . namely Az = 2ma^. The lined 
region corresponds to m = 1, dark gray to m = 2, and light 
gray to m = 3 in (|12|l . In all three cases the fiducial ACDM 
model {wo = —l,w\ = 0) is included within the one sigma 
contour, but the fit improves for m = 2, 3, corresponding to 
a decrease in uncertainty of the source redshift. 



redshift of the galaxy distribution (albeit only slightly). 

In Fig [5] we plot the constraints on wq and wi ob- 
tained with n = 1 and m = 1,2,3. This figure in- 
cludes the misidentified sources in order to illustrate how 
misidentified sources can bias the cosmology. Although 
the centers of the three one sigma regions (corresponding 
to three different values of m) predict a non-zero varia- 
tion in dark energy (non-zero wi), the fiducial ACDM 
model (wo = — 1 , wi = 0) does fall within the one sigma 
contour. Also note that larger values of to, correspond- 
ing to fewer misidentifications, are more consistent with 
the fiducial ACDM model. Note also that although the 
Icr contour increases slightly for larger m, this effect is 
pretty marginal. 

This may not be as surprising as it appears. The BBO 
beam is so narrow that the probability of finding a galaxy 
within it is small. The one sigma redshift range cor- 
responding to m = 1 would contain the source galaxy 
in roughly 68 per cent pencils, while in the remaining 
cases most pencils would not contain any galaxy at all. 
Therefore, the total number of misidentifications would 
be small, and their main role would be in biasing the 
DZ relation and not so much in controlling the tightness 
of fit. This is illustrated in Fig [SI which is identical to 
Fig[5]in all respects other than the fact that here we have 
removed the misidentified sources. The agreement with 
the fiducial model is slightly better in this case (especially 
for m = 2). The scatter in best fit value is due to the 
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FIG. 6: Same as figure [5] but after misidentifications have 
been removed in tlie resolved sample. The beam size is fixed 
at the one sigma angular resolution of BBO, corresponding 
to n = 1 in (llip . The lined region corresponds to m = 1, 
dark gray to m = 2, and light gray to m = 3 in (|12p . In all 
three cases the fiducial ACDM model {wo = —l,wi — 0) is 
included within the one sigma contour. Note that for m = 2 
(dark gray) the fit has improved substantially in comparison 
with figure [5] 
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FIG. 7: This figure shows the effect of including/excluding 
misidentified GW sources. The beam size is kept at the 
BBO one sigma angular resolution corresponding to n = 1 in 
and the redshift uncertainty of the source is two sigma: 
m — 2 in (|12p . In this figure the dashed region shows the 
fit with misidentified sources included and the gray region af- 
ter misidentified sources have been excluded. As expected, 
the one sigma contour is in better agreement with the fidu- 
cial ACDM model {wq = —l,wi =0) if misidentified sources 
have been excluded from the sample. 



slightly different number of resolved sources in the three 
cases, however, all three regions are consistent with each 
other. Fig [7] isolates the effect of including or exclud- 
ing the misidentified sources. It is clear from this figure 
that biasing, due to choosing too narrow a redshift range 
for the source redshift, is present but is smaller than the 
statistical errors on cosmological parameters. 

In Fig [8] we plot the constraints obtained by choosing 
different beams sizes but keeping the redshift error box 
at one sigma corresponding to m = 1 in (|12|) . The beam 
size chosen for different one sigma regions is n = 1,2, 3, 10 
in (ITT]) , and we find that the area under the one sigma 
confidence level increases with n. This is due to the fact 
that the total number of GPs decreases with increasing 
beam size. For this figure the misidentified sources were 
included in the determination of cosmological parame- 
ters. We find that even for n = 10, which is large enough 
to ensure that the true source would almost certainly fall 
within the beam, the constraints are sufficiently narrow, 
with very little bias, even though as Fig [9] illustrates, the 
total number of resolved sources in this case is smaller 
than what one finds for n = 1,2,3 (compare with figure 
S]); and the number of misidentifications is large, due to 
the fact that we have chosen m = 1 in ([T2l) . 

The left panel of Fig [10] plots the distribution of re- 
solved sources for n — 10 and m — 3, corresponding 
to the largest error box that we have considered. The 
number of resolved sources has decreased substantially. 



however, on the positive side, the number of misidenti- 
fications is much smaller than in Fig |9] Note also that 
there are almost no resolved sources beyond z = 1. The 
reason for this is the large source-redshift error at high 
redshift, which, when taken together with the fact that 
we are allowing the source to fall in the three sigma red- 
shift range, makes it very difficult to resolve sources. The 
right panel of the same figure shows the constraints ob- 
tained on cosmology. As expected, the constraints are 
poorer due to lack of sources beyond z — 1. 

The main reason for choosing large n is to ensure that 
the each beam contains the true source at a greater prob- 
ability. Our results show that for n = 10 and m = 1 the 
number of resolved sources is sufficient to give good con- 
straints on cosmology, however, the number of misidenti- 
fications is large. The bias thus produced is not large and 
it would seem that this configuration may give us good 
unbiased constraints on cosmology However, since our 
fiducial model has only two parameters, it is possible that 
the bias may make determining cosmology more difiicult 
for more complicated dark energy models. Therefore, it 
is necessary to keep the misidentifications small. It is 
difficult to make general statements regarding the opti- 
mum configuration for n and m for a more complicated 
dark energy model, however, our results suggest that the 
range 3 < n < 6 and 2 < m < 3 may work in most cases. 
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FIG. 8: In this figure we illustrate the effect of increasing the 
beam width on the reconstruction of the equation of state 
of dark energy. The uncertainty in source redshift remains 
fixed at m = 1 in (|12[) while the beam width increases as 
n = 1, 2, 3, 10 in (|lll) . which corresponds to la, 2a, 3a and 10a 
times the BBO value. White corresponds to n = 1, dark gray 
to n = 2, dashed to n = 3 and light gray to n = 10. Increas- 
ing the beam size decreases the overall number of resolved 
GW sources using which w{z) is reconstructed, and therefore 
slightly increases the area of the one sigma contour. It is in- 
teresting that the fiducial ACDM model {wo — — l,mi — 0) 
lies within the la contour for all values of n. 

IV. CONCLUSIONS 

To conclude, we would like to underscore the impor- 
tant point that our self-calibrating scheme works very 
well even if none of the gravity wave sources have ob- 
servable electromagnetic signatures. Indeed if the beam 
width is not too large (< ten times BBO), then the pres- 
ence of only a few gold plated sources (those whose red- 
shift has been independently established) in conjunction 
with the iterative procedure presented here, allows us to 
determine the equation of state of dark energy to an accu- 
racy of a few percent - see figure [51 The two main issues 
investigated in this work are the application of ideas pre- 
sented in S3 to simulated data, and to address the issue 
of misidentification of GW sources. Our simulations sim- 
plify details in two respects: first, we do not include the 
effect of clustering, second, we model the lensing scatter 
as a symmetric Gaussian distribution. (We shall assess 
the effects of asymmetry induced my lensing as well as 
the effects of clustering in a companion paper.) We find 
that the method works quite well on simulated data and 
the iterations saturate after a small number of steps. 

We showed that misidentified sources might in general 



bias the determination of DZ relation and would lead to 
a runaway process by which the DZ relation would move 
away systematically from the true cosmological DZ re- 




Redshift (z) 

FIG. 9: The normalized distribution of resolved sources at the 
end of our iteration. The solid line is the redshift distribution 
of all sources. The total fraction of resolved (light gray) and 
misidentified (dark gray) sources is plotted as a function of 
redshift for n = 10 in ([Tt} and m = 1 in (fT2)). 



lation, thus biasing the estimation of the cosmological 
parameters. In our simulations, the effect of biasing is 
present but is found to be small. We have shown that 
increasing the allowed redshift range for the GW source 
reduces this bias even further, without significantly af- 
fecting the cosmological constraints (due to a reduction 
in the number of resolved sources). A further positive 
conclusion is that the method works even for larger beam 
sizes, thereby addressing the concern that not all beams 
might contain the source galaxy. 

In our simulations the asymmetric lensing scatter 
has been modeled as a symmetric Gaussian distribu- 
tion. Lensing scatter has a large number of demagni- 
fied sources and a small number of compensating large 
magnifications. As mentioned in |l2!|, highly magnified 
sources would show up as outliers in the DZ plot and 
can be either removed from the sample or else, in the 
method that we propose, can be handled by choosing a 
small value of m. As our results show, the bias due to 
misidentifications is relatively small, so this choice would 
not cause significant distortions in the estimates of cos- 
mological parameters. 
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FIG. 10: Left panel displays the distribution of resolved sources (light grey) and misidentified sources (dark grey) for n = 10 in 
(|11[) and m = 3 in (|12|) . (Note the lack of resolved sources beyond z = 1.) The right panel shows confidence contours for the 
EOS of dark energy with n = 10 and m — 1 (light grey) and m = 3 (dark grey), respectively. 
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